ABSTRACT

Tracking moving objects, especially humans, in surveillance systems has
attracted considerable research attention. This study proposes a novel joint
Electro-Optical (EO) and Infrared (IR) camera tracking approach that employs a
particle filter. A centroid-based detection technique is used to discover
potentially moving objects and obtain their coordinates. Once moving targets
are detected, EO and IR features are combined to extract object templates for
sampling particles. Statistical information about a blob centered at the
current particle and the likelihood of each pixel in terms of foreground,
background, and occlusion components are obtained to determine and update the
importance of each particle and to handle temporary occlusion. Hence, particles
that provide accurate predictions are assigned higher weights. Simulations
have been conducted to validate the proposed method.

1.  INTRODUCTION 

With growing security concerns, it is crucial to develop powerful surveillance
systems to monitor, detect, and track potentially malicious events and people.
However, traditional visible-light EO video cameras fail to provide accurate
information in certain situations, such as illumination variation or occlusion,
given their inherent constraints. Intuitively, complementary information can
be obtained by deploying devices whose features complement those of video
cameras, for example, infrared video cameras, which depend on heat dispersion.

IR images are normally of low resolution; therefore, shape is the major
feature for image data analysis. However, IR cameras can monitor objects
regardless of lighting conditions, since IR images are formed on the basis of
heat emissivity, the conductivity of the material surface, reflection, and so
on. EO images, on the other hand, offer color, texture, and shape features
that can be exploited. However, EO cameras rely on sufficient illumination of
the environment. Therefore, multi-modality surveillance systems have attracted
research attention as possible solutions to improve detection and tracking
performance.


Further, many existing military surveillance and monitoring systems are now
equipped with both visible-light electro-optical (EO) cameras and infrared
(IR) cameras. However, most of these cameras are operated separately, with EO
cameras mainly producing daylight vision and IR cameras producing night
vision. Intuitively, EO and IR cameras may provide complementary information
if they are used jointly. An EO image mostly represents the intensity of
visible light reflected from an object in the field of view, while an IR image
captures the thermal profile of the object.

This study proposes a new technique for tracking moving objects, even with
occlusions under weak illumination conditions, through joint processing of EO
and IR video sequences using particle filters. The article is organized as
follows. Section 2 reviews previous work. Section 3 describes the basic ideas
and algorithms of the proposed moving object detection and tracking system.
Section 4 presents the simulation setup and results. Finally, the article is
summarized in Section 5.


2.  OVERVIEW  OF  PREVIOUS  WORK 

Kang et al. proposed a joint probability model of EO and IR cameras and used a
Kalman filter to resolve the occlusion problem [1] [2]. The Kalman filter is
an optimal solution for linear systems with Gaussian noise. However, tracking
environments are normally complex and may not fit a linear system model.
Shaohua et al. proposed a particle-filter-based tracking system with an
appearance-adaptive model [3].

Khan et al. proposed a template-based particle filter system to track ants
[4]. Compared with people, ants are more rotation-invariant, hence the
learning system should be somewhat different. James et al. studied a
multi-target detection and tracking algorithm that deploys a boosted particle
filter [5]. It depends on the color features of EO images, which are sensitive
to lighting conditions. Pupilli et al. developed a particle-filter-based
algorithm to deal with occlusion of objects [6]. It used a large number of
particles in state space to track a temporarily lost target.

3.  DETECTION  AND  TRACKING  OF  MOVING  OBJECTS

In this section, the fundamental ideas and algorithms for detecting and
tracking moving objects are presented.

The general framework of the centroid-based detection and
particle-filter-based tracking system is as follows:

•  Register EO and IR images using transformation and translation algorithms

•  Detect moving objects using the centroid-based method

•  Obtain templates of moving targets by combining their EO and IR
counterparts, if both exist, to take advantage of the heterogeneous
monitoring mechanism; otherwise, use single-modality data

•  Generate particles on the detected moving regions to track further changes

•  Track moving objects using the particle-filter-based technique

3.1.  Continuous  Detection  of  Moving  Objects 

EO and IR images were registered using a piecewise linear transformation.
Since it is hard, if not impossible, to physically overlap EO and IR cameras
to obtain perfectly aligned images, image registration is an essential step
for further analysis of the two-modality (i.e., EO and IR) images. The basic
approach is to match landmarks so that the same objects overlap in the images
captured by the EO and IR cameras, respectively [7].

A difference image between sequential frames is derived for further detection
of movement. Thereafter, an edge detection algorithm is used to obtain the
boundaries of potentially changing objects. To highlight regions of large
intensity change, i.e., regions that possibly contain edges, a Laplacian
filter suitable for black-and-white images can be applied to facilitate
further detection. The Canny edge detection method, which is known as an
optimal edge detection algorithm in terms of precision, was used to detect
shape information. Shape information is useful for establishing feature
correspondence between EO and IR images.

After the boundaries of potentially moving objects are determined, the
coordinates of the centroid of each object are calculated by intersecting the
horizontal and vertical scan results. Once the centroid data of the changing
regions are derived, the particle filter tracking algorithm can be triggered
to explore suspicious areas.
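
The detection step can be illustrated with a minimal sketch. It is not the
implementation used in this study: the threshold value, the function names,
and the use of row and column projections to stand in for the "horizontal and
vertical scans" are all assumptions, and the edge detection stage is omitted
for brevity.

```python
def frame_difference(prev, curr, threshold=30):
    """Binary change mask: 1 where the absolute intensity change between two
    grayscale frames exceeds the (assumed) threshold, 0 elsewhere."""
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def centroid_from_scans(mask):
    """Centroid (row, col) of a binary mask from its horizontal and vertical
    scans, i.e. the per-row and per-column counts of changed pixels.
    Returns None when no pixel changed."""
    row_counts = [sum(row) for row in mask]        # horizontal scan
    col_counts = [sum(col) for col in zip(*mask)]  # vertical scan
    total = sum(row_counts)
    if total == 0:
        return None
    r = sum(i * n for i, n in enumerate(row_counts)) / total
    c = sum(j * n for j, n in enumerate(col_counts)) / total
    return (r, c)
```

A centroid returned by `centroid_from_scans` would then seed the particle
filter described in the next subsection.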

3.2.  Particle  Filter 

The particle filter is a Monte Carlo method. It is used for nonlinear and
non-Gaussian problems, and approximates a continuous probability density
function by a large number of samples, i.e., a discrete distribution
approximation. There are various types of particle filters, such as
sequential importance sampling (SIS), sampling importance resampling (SIR),
auxiliary sampling importance resampling, and the regularized particle filter
[8]. Ellipses are selected to approximate suspicious regions since our main
concern is to track human beings, who can generally be described by this
shape. Shape is a major feature of IR images, which have relatively low
resolution. Furthermore, ellipse modelling can be extended to track changes in
human body contours to obtain more insight [9].

The basic procedure designed for our surveillance problem is as follows:

1.  Define the state and measurement equations

2.  Spread particles on target regions to approximate the continuous
probability density function

3.  Collect historical data to predict the next state

4.  Predict the status of the subsequent stage

5.  Evaluate the real state by measurement techniques

6.  Update the weight of each particle based on the accuracy of the
prediction, i.e., the distance between the prediction and the actual
measurement

7.  If the degeneracy problem occurs, apply resampling

8.  Go to step 4
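
The predict, measure, update, and resample loop above can be sketched on a toy
one-dimensional problem. This is only an illustration under stated
assumptions: the random-walk motion model, the Gaussian likelihood, the noise
levels, and the resampling threshold of half the particle count are all
hypothetical choices, not the models used in this study.

```python
import math
import random


def gaussian_likelihood(z, x, sigma):
    """Unnormalized Gaussian likelihood of measurement z given state x."""
    return math.exp(-0.5 * ((z - x) / sigma) ** 2)


def effective_sample_size(weights):
    """Common degeneracy indicator for normalized weights."""
    return 1.0 / sum(w * w for w in weights)


def particle_filter(measurements, n=500, process_sigma=1.0,
                    meas_sigma=2.0, seed=0):
    """Toy 1-D SIR filter; returns the weighted-mean estimate per step."""
    rng = random.Random(seed)
    # Step 2: spread particles around the first detection.
    particles = [rng.gauss(measurements[0], meas_sigma) for _ in range(n)]
    weights = [1.0 / n] * n
    estimates = []
    for z in measurements:
        # Step 4: predict (assumed random-walk dynamics).
        particles = [x + rng.gauss(0.0, process_sigma) for x in particles]
        # Steps 5-6: weight each particle by agreement with the measurement.
        weights = [w * gaussian_likelihood(z, x, meas_sigma)
                   for w, x in zip(weights, particles)]
        total = sum(weights)
        weights = ([w / total for w in weights] if total > 0
                   else [1.0 / n] * n)
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Step 7: resample when the weights degenerate.
        if effective_sample_size(weights) < n / 2:
            particles = rng.choices(particles, weights=weights, k=n)
            weights = [1.0 / n] * n
    return estimates
```

Calling `particle_filter([10.0] * 20)` tracks a stationary target; in the
tracking system the scalar state would be replaced by the state vector of
Section 3.3.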


3.3.  State  Dynamic  Models 


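The following state equation describes human movement dynamics [10]:

$$x_t = [r_t, s_t, \dot{r}_t, \dot{s}_t]^T$$

$$x_t = \begin{pmatrix} 1 & 0 & \tau & 0 \\ 0 & 1 & 0 & \tau \\ 0 & 0 & a_r & 0 \\ 0 & 0 & 0 & a_s \end{pmatrix} x_{t-1} + m_t$$

where $m_t$ is the noise vector and $(r_t, s_t)$ denotes the image
coordinates.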
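
One propagation step of this model can be sketched as follows. The sketch
assumes that the transition matrix has a constant-velocity form, with a frame
interval `tau` coupling position to velocity and damping factors `a_r` and
`a_s` on the two velocity components; these parameter names, their default
values, and the isotropic Gaussian noise are assumptions made for
illustration.

```python
import random


def propagate(state, tau=1.0, a_r=1.0, a_s=1.0, noise_sigma=0.0, rng=None):
    """Advance a state [r, s, v_r, v_s] by one frame.

    Positions move by tau times the velocity, velocities are scaled by the
    (assumed) damping factors, and zero-mean Gaussian noise of standard
    deviation noise_sigma is added to every component.
    """
    rng = rng or random.Random()
    r, s, v_r, v_s = state
    m = [rng.gauss(0.0, noise_sigma) if noise_sigma > 0 else 0.0
         for _ in range(4)]
    return [r + tau * v_r + m[0],
            s + tau * v_s + m[1],
            a_r * v_r + m[2],
            a_s * v_s + m[3]]
```

With `noise_sigma=0.0` the step is deterministic, which is convenient for
checking the matrix entries before noise is added.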

The measurement equations for the object edges are as follows:

$$\mathrm{input}_x = c_x + r_1 \cos(\theta)$$

$$\mathrm{input}_y = c_y + r_2 \sin(\theta)$$

The color measurement equation proposed by Yang et al. [11] is given by

$$(r_i, g_i, b_i) = \sum_{(x,y) \in R_i} \left( r(x,y), g(x,y), b(x,y) \right) / A_i$$

where $(c_x, c_y)$ and $(\mathrm{input}_x, \mathrm{input}_y)$ are image
coordinates, $\theta$ is the angle, and $r_1$ and $r_2$ are the ellipse radii.

Movement prediction is updated on the basis of the average shift over
previous frames.
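
A minimal sketch of such an average-shift prediction is given below. The
window length and the function name are hypothetical; the source does not
state how many previous frames are averaged.

```python
def predict_from_average_shift(positions, window=3):
    """Predict the next (r, s) position by adding the mean frame-to-frame
    shift over the last `window` shifts to the latest position."""
    if len(positions) < 2:
        return positions[-1]  # no history yet: assume the target stays put
    recent = positions[-(window + 1):]
    shifts = [(b[0] - a[0], b[1] - a[1]) for a, b in zip(recent, recent[1:])]
    dr = sum(d[0] for d in shifts) / len(shifts)
    ds = sum(d[1] for d in shifts) / len(shifts)
    return (positions[-1][0] + dr, positions[-1][1] + ds)
```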


3.4.  Preprocessing  Stage 

A limited number of samples are placed around the ellipse to approximate the
posterior probability density function. When a sudden change occurs, more
particles are required to predict possible movements. Given the object size,
particles cover 3 concentric ellipses, at 1 particle per degree, to encircle
an object. By the nature of the particle filter, more samples may provide a
more precise approximation of the actual probability density function.
However, resource consumption and computation are more intensive as well. In
addition, occlusion involves even less predictable state-space changes;
therefore, more particles are necessary to predict possible behaviors. The
next state is predicted by computing the state-space equation. In this case,
the positions of the ellipses are estimated accordingly to track moving
objects.
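
The particle layout can be sketched as below, using the ellipse
parameterization of Section 3.3. The three scale factors are assumptions (the
source gives only the number of ellipses and the angular density), as are the
function names.

```python
import math


def ellipse_point(cx, cy, r1, r2, theta_deg):
    """Point on an axis-aligned ellipse centered at (cx, cy) with radii r1
    and r2, at angle theta_deg in degrees."""
    t = math.radians(theta_deg)
    return (cx + r1 * math.cos(t), cy + r2 * math.sin(t))


def concentric_particles(cx, cy, r1, r2, scales=(0.5, 1.0, 1.5), step_deg=1):
    """Particles on concentric ellipses, one particle per step_deg degrees.
    The scale factors are hypothetical."""
    return [ellipse_point(cx, cy, r1 * k, r2 * k, a)
            for k in scales
            for a in range(0, 360, step_deg)]
```

With the default settings this yields 3 x 360 = 1080 particles per object,
which gives a sense of the computational cost mentioned above.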

3.5.  Tracking  Process  by  Particle  Filter 

Weights are computed to reflect the importance of each particle. Importance
is associated with the accuracy of approximating the continuous probability
distribution by discrete measurements. Initially, equal weights are assigned
to all particles. After the actual measurement is obtained, the weights of the
particles are updated on the basis of the distance between the probability
distribution of the template and that of the region centered at the current
particle, as well as the mixture likelihood of the foreground, background, and
occlusion components. Theoretically, the choice of posterior probability
distribution is essential for the accuracy of state estimation.

A Gaussian mixture model (GMM) is deployed to represent the probability
density distribution of objects or templates. The experimental results
indicate that the GMM is an appropriate candidate for evaluating the accuracy
of particles in terms of weight update. The following equation defines a GMM
with more than one component in $\mathbb{R}^n$ for $n \geq 1$ [12]:

$$p(x|\theta) = \sum_{m=1}^{M} \alpha_m \, p(x|\theta_m), \quad \forall x \in \mathbb{R}^n$$

where

$$\sum_{m=1}^{M} \alpha_m = 1, \quad \alpha_m \geq 0 \text{ for } m = 1, \ldots, M$$

and every component follows a normal distribution

$$p(x|\theta_m) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
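
As an illustration, the mixture density can be evaluated for scalar data (for
example, pixel intensities) as in the sketch below. The function names are
hypothetical, and each component is assumed to have its own mean and standard
deviation.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a univariate normal distribution."""
    return (math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
            / (sigma * math.sqrt(2.0 * math.pi)))


def gmm_pdf(x, alphas, mus, sigmas):
    """Mixture density: sum of alpha_m * N(x; mu_m, sigma_m).
    The mixing weights must be non-negative and sum to one."""
    if any(a < 0 for a in alphas) or abs(sum(alphas) - 1.0) > 1e-9:
        raise ValueError("mixing weights must be >= 0 and sum to 1")
    return sum(a * normal_pdf(x, m, s)
               for a, m, s in zip(alphas, mus, sigmas))
```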

Further, the expectation-maximization (EM) algorithm is implemented to
estimate the image distribution. The GMM is a general method for describing
the density function of image data, since more than one normal distribution
component may represent the actual distribution more precisely. Histograms of
the EO and IR segments can be used as an image feature of the template and the
object of interest.
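
A compact sketch of EM for a scalar GMM follows. It shows the standard
E-step and M-step only; the initialization, the fixed iteration count, and
the lower bound on the standard deviation are assumptions, and the joint EO
and IR estimation used in this study is not reproduced here.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a univariate normal distribution."""
    return (math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
            / (sigma * math.sqrt(2.0 * math.pi)))


def em_gmm_1d(data, mus, sigmas, alphas, iterations=50, min_sigma=1e-3):
    """Fit a 1-D GMM by EM, starting from the given initial parameters.
    Returns (alphas, mus, sigmas)."""
    mus, sigmas, alphas = list(mus), list(sigmas), list(alphas)
    k = len(mus)
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [alphas[m] * normal_pdf(x, mus[m], sigmas[m])
                 for m in range(k)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([v / total for v in p])
        # M-step: re-estimate weight, mean, and spread of each component.
        for m in range(k):
            n_m = sum(r[m] for r in resp)
            if n_m < 1e-12:
                continue  # component lost all support; keep old parameters
            alphas[m] = n_m / len(data)
            mus[m] = sum(r[m] * x for r, x in zip(resp, data)) / n_m
            var = sum(r[m] * (x - mus[m]) ** 2
                      for r, x in zip(resp, data)) / n_m
            sigmas[m] = max(math.sqrt(var), min_sigma)
    return alphas, mus, sigmas
```

For intensity data with a dark and a bright cluster, two components
initialized at the ends of the intensity range are usually sufficient for
this sketch to separate them.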


To use the intensity data more efficiently and improve tracking performance,
a mixture appearance model is incorporated into the particle filter [3]. The
mixture model, consisting of background, template, and occlusion elements, can
provide more effective particle sampling, as well as handle occlusion by
expanding the observed area. If a potential occlusion is inferred, more
particles are necessary to accurately track further movements of the observed
objects. The weight of each pixel is updated based on the mixture likelihood,
given the three components. In this way, particles are sampled around the
area with a higher likelihood of reappearance of the object.

Joint expectation maximization of EO and IR is a promising method to take
advantage of multi-modality. Further, the RGB color feature of EO images is a
good complement to IR images in terms of visual properties, since the
resolution of IR images is generally lower. However, IR images are less
susceptible to lighting changes, since thermal cameras respond to heat
emission rather than lighting conditions. In addition, the empirical results
suggest that IR images can provide a foundation for simple and accurate change
analysis. Once targets are localized, shape correspondence is the key to
relating EO and IR image data. Visual features of EO images can then be
combined and explored to gain more insight into the observed scenario.

After the statistical data of the template and the particle-centered regions
are derived, the distance between the probability distributions is computed
to determine the weights. The Kullback-Leibler divergence (KLD) measures the
distance between the original probability distribution and a candidate
probability distribution [13]. It is defined as

$$D_{KL}(P, Q) = \sum_i \left[ P(i) \log P(i) - P(i) \log Q(i) \right]$$

where $P$ is the probability density function of the template, and $Q$
represents the particle's probability density function.
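
For histogram features, the divergence can be computed as in the sketch
below. The additive smoothing constant and, in particular, the exponential
mapping from divergence to a particle weight are assumptions; the source does
not specify how the distance is converted into a weight.

```python
import math


def normalize_histogram(counts, eps=1e-9):
    """Turn bin counts into a probability vector. A small eps (assumed) is
    added to every bin so that no probability is exactly zero."""
    smoothed = [c + eps for c in counts]
    total = sum(smoothed)
    return [c / total for c in smoothed]


def kl_divergence(p, q):
    """D_KL(P, Q) = sum_i P(i) log P(i) - P(i) log Q(i), natural log.
    Every Q(i) must be positive wherever P(i) is positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def weight_from_divergence(d, scale=1.0):
    """Hypothetical mapping: smaller divergence gives a larger weight."""
    return math.exp(-scale * d)
```

Note that the divergence is not symmetric, so the template should be passed
consistently as the first argument.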

In the proposed particle filter tracking framework, the distance between the
probability distribution of the template, i.e., the original one, and that of
the region centered at a particular particle is measured. Theoretically, the
regions around particles close to the true centroid have probability
distributions similar to that of the template, and these particles therefore
deserve higher weights so that limited resources are allocated effectively.

The weight associated with each particle is calculated by

$$w'^{i}_t = w^{i}_t \, p(y_t | x^{i}_t)$$

Thereafter, the weights are normalized as

$$w^{i}_t = w'^{i}_t / \gamma, \quad \text{where } \gamma = \sum_{i=1}^{N_s} w'^{i}_t$$

The posterior probability distribution is calculated by [10]

$$p(x_t | z_{1:t}) \approx \sum_{i=1}^{N_s} w^{i}_t \, \delta(x_t - x^{i}_t)$$

where $x_t$ is the state at time $t$; $z_{1:t}$ denotes the history of
observations; and $w^{i}_t$ are the weights of the particles.

In the case of sample degeneracy, a resampling technique is deployed [10].
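
The weight update, normalization, and resampling steps can be sketched as
follows. Systematic resampling and the effective-sample-size test are common
choices but are assumptions here; the source does not state which resampling
scheme or degeneracy criterion was used.

```python
import random


def update_weights(weights, likelihoods):
    """Multiply each weight by its likelihood, then divide by the sum."""
    unnormalized = [w * l for w, l in zip(weights, likelihoods)]
    gamma = sum(unnormalized)
    if gamma <= 0:
        raise ValueError("all particle weights vanished")
    return [u / gamma for u in unnormalized]


def effective_sample_size(weights):
    """Small values (relative to the particle count) signal degeneracy."""
    return 1.0 / sum(w * w for w in weights)


def systematic_resample(particles, weights, rng):
    """Draw len(particles) samples using one random offset and evenly
    spaced pointers over the cumulative weights. Returns the resampled
    particles and uniform weights."""
    n = len(particles)
    start = rng.random() / n
    out = []
    i = 0
    cumulative = weights[0]
    for k in range(n):
        u = start + k / n
        while u >= cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        out.append(particles[i])
    return out, [1.0 / n] * n
```

A typical usage would resample only when `effective_sample_size(weights)`
falls below some fraction of the particle count, for example one half.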

4.  TESTING  DETECTION  AND 
PARTICLE-FILTER-BASED  TRACKING  SYSTEM 

In the previous section, the detection and tracking techniques for moving
targets were discussed. The simulation procedure, results, and implications
are presented in this section.

The detection system and the particle-filter-based tracking system were
implemented and tested in Matlab. The OTCBVS04 datasets [14] were selected
for the simulation, since both EO and corresponding IR images are provided.
The temporary occlusion case was tested by playing back old frames and
remixed frames, and a weakly illuminated region was selected for testing. The
testing procedure followed the proposed detection and tracking technique. The
simulation results in Figures 1, 3, and 4 show that moving parts of the human
body can be detected and tracked. However, there was a false alarm in the
detection phase, where noise was captured.


Fig.  1.  Tracking  Moving  Parts  of  Human  Body 


Particles that contribute more to the predictions gain more weight than
others, hence limited resources can be used more efficiently. Further,
changing parts of the body, such as the legs, attract more attention.
Temporary occlusion under weak illumination was handled by placing more
particles. The likelihood function with the three components is helpful in
determining the weights of the particles. State dynamics that consider
previous movements are more efficient in prediction. The testing result is
shown in Figure 2. Hence, the method can be used to obtain more details on
moving objects. This work will be extended in the future by deploying more
ellipses to describe the contour of the human body and by monitoring
interacting targets.

5.  CONCLUSION 

In conclusion, this work introduced a new multi-modal object tracking method
based on particle filters. A movement detection and tracking system based on
joint EO and IR cameras was developed. A centroid-based algorithm was
implemented to detect changes and collect templates. A particle filter was
used to track moving objects, with the aim of resolving temporary occlusion
under insufficient illumination. Our study has shown that the particle filter
appears to be a promising mathematical framework for multi-modal data fusion,
in which observations and features from different modalities are used to
estimate the joint posterior probability at each tracking stage. Simulation
results showed that the system can detect a moving object and track it, even
under temporary occlusion. This approach can be more efficient than the
simple integration of separate tracking results from individual modalities,
and it may have great potential in most military multi-sensor systems. One
direction for future work is to extend this method to acoustic and visual
sensor data fusion.

Fig. 2. Tracking Output after Temporary Occlusion and the Statistics of the
Target

Fig. 3. Tracking Results of the First Sequence

Fig. 4. Tracking Results of the Second Sequence